8.1.1 Random Sampling

When collecting data, we often make several observations on a random variable. For example,
suppose that our goal is to investigate the height distribution of people in a well-defined
population (e.g., adults between 25 and 50 in a certain country). To do this, we define random
variables $X_1, X_2, X_3, \ldots, X_n$ as follows: We choose a random sample of size $n$ with
replacement from the population and let $X_i$ be the height of the $i$th chosen person. More
specifically,

1. We choose a person uniformly at random from the population and let $X_1$ be the height of
that person. Here, every person in the population has the same chance of being chosen.

2. To determine the value of $X_2$, we again choose a person uniformly at random (and
independently of the first choice) and let $X_2$ be the height of that person. Again, every
person in the population has the same chance of being chosen.

3. In general, $X_i$ is the height of the $i$th person, who is chosen uniformly and
independently from the population.


You might ask why we sample with replacement. In practice, we often sample without
replacement, that is, we do not allow one person to be chosen twice. However, if the
population is large relative to the sample size, then the probability of choosing one person
twice is extremely low, and it can be shown that the results obtained from sampling with
replacement are very close to the results obtained using sampling without replacement. The big
advantage of sampling with replacement (the above procedure) is that the $X_i$'s will be
independent, and this makes the analysis much simpler.
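
To get a feel for how rarely a person is chosen twice, the sketch below computes the exact
probability that $n$ draws with replacement are all distinct, and then draws one sample each
way. The population size `N`, the sample size `n`, and the function name `prob_all_distinct`
are illustrative assumptions, not values taken from the text.

```python
import random

def prob_all_distinct(population_size, sample_size):
    """Exact probability that sampling WITH replacement never repeats a person."""
    p = 1.0
    for k in range(sample_size):
        # The (k+1)-th draw must avoid the k people already chosen.
        p *= (population_size - k) / population_size
    return p

rng = random.Random(0)            # fixed seed so the run is repeatable
N, n = 1_000_000, 100             # hypothetical population and sample sizes

# Draw person indices 0..N-1 rather than heights, so repeats are easy to spot.
with_repl = [rng.randrange(N) for _ in range(n)]   # repeats are allowed
without_repl = rng.sample(range(N), n)             # repeats are impossible

print(prob_all_distinct(N, n))    # close to 1 when N is much larger than n
print(len(set(with_repl)), len(set(without_repl)))
```

When `n` is not small compared with `N`, the same function shows the probability of no repeats
dropping quickly, which is exactly the regime where the two sampling schemes start to differ.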

Now, for example, if we would like to estimate the average height in the population, we may
define an estimator as

$$\hat{\Theta} = \frac{X_1 + X_2 + \cdots + X_n}{n}.$$

The random variables $X_1, X_2, X_3, \ldots, X_n$ defined above are independent and identically
distributed (i.i.d.), and we refer to them collectively as a (simple) random sample.

The collection of random variables $X_1, X_2, X_3, \ldots, X_n$ is said to be a random sample of
size $n$ if they are independent and identically distributed (i.i.d.), i.e.,

1. $X_1, X_2, X_3, \ldots, X_n$ are independent random variables, and

2. they have the same distribution, i.e.,

$$F_{X_1}(x) = F_{X_2}(x) = \cdots = F_{X_n}(x), \quad \text{for all } x \in \mathbb{R}.$$

In the above example, the random variable $\hat{\Theta} = \frac{X_1 + X_2 + \cdots + X_n}{n}$ is
called a point estimator for the average height in the population. After performing the above
experiment, we will obtain $\hat{\Theta} = \hat{\theta}$. Here, $\hat{\theta}$ is called an
estimate of the average height in the population. In general, a point estimator is a function
of the random sample, $\hat{\Theta} = h(X_1, X_2, \cdots, X_n)$, that is used to estimate an
unknown quantity.
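
The distinction between the estimator (a function of the random sample) and the estimate (the
number obtained after the experiment) can be sketched in a few lines. The population below is
synthetic: the normal shape, the mean of 170 cm, the standard deviation of 10 cm, and the
sample size `n = 400` are all made-up assumptions for illustration.

```python
import random

def sample_mean(sample):
    """Point estimator h(X_1, ..., X_n): the average of the sample."""
    return sum(sample) / len(sample)

rng = random.Random(1)            # fixed seed so the run is repeatable

# Hypothetical population of heights in cm (synthetic, not real data).
population = [rng.gauss(170.0, 10.0) for _ in range(50_000)]
population_mean = sample_mean(population)   # the unknown quantity in practice

# One run of the experiment: n uniform draws with replacement.
n = 400
observed = [rng.choice(population) for _ in range(n)]

# Plugging the observed values into the estimator gives one estimate.
estimate = sample_mean(observed)

print(population_mean, estimate)
```

Rerunning the last three statements with a different seed gives a different estimate, while
the estimator `sample_mean` itself stays the same function.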

It is worth noting that there are different methods for sampling from a population. We refer to 
the above sampling method as simple random sampling. In general, "sampling is concerned 
with the selection of a subset of individuals from within a statistical population to estimate 
characteristics of the whole population" [18]. Nevertheless, for the material that we cover in 
this book simple random sampling is sufficient. Unless otherwise stated, when we refer to 
random samples, we assume they are simple random samples. 

Some Properties of Random Samples: 

Since we will be working with random samples, we review some of their properties in this
section. Here, we assume that $X_1, X_2, X_3, \ldots, X_n$ are a random sample. Specifically,
we assume

1. the $X_i$'s are independent;

2. $F_{X_1}(x) = F_{X_2}(x) = \cdots = F_{X_n}(x) = F_X(x)$;

3. $EX_i = EX = \mu < \infty$;

4. $0 < \mathrm{Var}(X_i) = \mathrm{Var}(X) = \sigma^2 < \infty$.

Sample Mean: 

The sample mean is defined as

$$\overline{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}.$$

Another common notation for the sample mean is $M_n$. Since the $X_i$'s are assumed to have the
CDF $F_X(x)$, the sample mean is sometimes denoted by $M_n(X)$ to indicate the distribution
of the $X_i$'s.

Properties of the sample mean 

1. $E\overline{X} = \mu$.

2. $\mathrm{Var}(\overline{X}) = \frac{\sigma^2}{n}$.

3. Weak Law of Large Numbers (WLLN): for any $\epsilon > 0$,

$$\lim_{n \to \infty} P(|\overline{X} - \mu| \geq \epsilon) = 0.$$

4. Central Limit Theorem: The random variable

$$Z_n = \frac{\overline{X} - \mu}{\sigma / \sqrt{n}} = \frac{X_1 + X_2 + \cdots + X_n - n\mu}{\sqrt{n}\,\sigma}$$

converges in distribution to the standard normal random variable as $n$ goes to infinity,
that is,

$$\lim_{n \to \infty} P(Z_n \leq x) = \Phi(x), \quad \text{for all } x \in \mathbb{R},$$

where $\Phi(x)$ is the standard normal CDF.
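
The four properties above can be checked numerically. The following is a rough simulation
sketch, not a proof: it assumes the $X_i$'s are Uniform$(0,1)$ (so $\mu = 1/2$ and
$\sigma^2 = 1/12$), and the sample sizes, the number of repetitions `reps`, and the tolerance
`eps` are arbitrary choices made for illustration.

```python
import math
import random
import statistics

def standard_normal_cdf(x):
    """Phi(x), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simulate_sample_means(n, reps, rng):
    """Return `reps` independent realizations of the mean of n Uniform(0, 1) draws."""
    return [sum(rng.random() for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(2)                    # fixed seed so the run is repeatable
mu, sigma = 0.5, math.sqrt(1.0 / 12.0)    # mean and standard deviation of Uniform(0, 1)
reps, n = 2_000, 30

means = simulate_sample_means(n, reps, rng)
mean_of_means = statistics.mean(means)        # property 1: should be near mu
var_of_means = statistics.pvariance(means)    # property 2: should be near sigma**2 / n

# WLLN: the fraction of runs with |sample mean - mu| >= eps shrinks as n grows.
eps = 0.05
tail = {}
for m in (10, 100, 1000):
    xs = simulate_sample_means(m, reps, rng)
    tail[m] = sum(abs(x - mu) >= eps for x in xs) / reps

# CLT: the standardized sample mean Z_n should have a CDF close to Phi.
z = [(x - mu) / (sigma / math.sqrt(n)) for x in means]
empirical_cdf_at_1 = sum(v <= 1.0 for v in z) / reps

print(mean_of_means, var_of_means, sigma ** 2 / n)
print(tail)
print(empirical_cdf_at_1, standard_normal_cdf(1.0))
```

The printed pairs should be close but not identical, since each is a Monte Carlo estimate
based on a finite number of repetitions.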

Order Statistics: 

Given a random sample, we might be interested in quantities such as the largest, the smallest,
or the middle value in the sample. Thus, we often order the observed data from the smallest to
the largest. We call the resulting ordered random variables order statistics. More specifically,
let $X_1, X_2, X_3, \ldots, X_n$ be a random sample from a continuous distribution with CDF
$F_X(x)$. Let us order the $X_i$'s from the smallest to the largest and denote the resulting
sequence of random variables as

$$X_{(1)}, X_{(2)}, \cdots, X_{(n)}.$$

Thus, we have

$$X_{(1)} = \min(X_1, X_2, \cdots, X_n)$$

and

$$X_{(n)} = \max(X_1, X_2, \cdots, X_n).$$

We call $X_{(1)}, X_{(2)}, \cdots, X_{(n)}$ the order statistics of the random sample
$X_1, X_2, X_3, \ldots, X_n$. We are often interested in the PDFs or CDFs of the $X_{(i)}$'s.
The following theorem provides these functions.
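
On observed data, the order statistics are simply the sorted sample. A minimal sketch follows,
assuming a sample of size 5 from Uniform$(0,1)$; the helper name `order_statistics` is
hypothetical.

```python
import random

def order_statistics(sample):
    """Return [X_(1), ..., X_(n)]: the sample sorted from smallest to largest."""
    return sorted(sample)

rng = random.Random(3)                       # fixed seed so the run is repeatable
sample = [rng.random() for _ in range(5)]    # one realization of X_1, ..., X_5
ordered = order_statistics(sample)

smallest = ordered[0]                 # X_(1) = min(X_1, ..., X_n)
largest = ordered[-1]                 # X_(n) = max(X_1, ..., X_n)
middle = ordered[len(ordered) // 2]   # X_(3), the middle value when n = 5

print(ordered)
print(smallest, middle, largest)
```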

Theorem 8.1

Let $X_1, X_2, \ldots, X_n$ be a random sample from a continuous distribution with CDF $F_X(x)$
and PDF $f_X(x)$. Let $X_{(1)}, X_{(2)}, \cdots, X_{(n)}$ be the order statistics of
$X_1, X_2, X_3, \ldots, X_n$. Then the CDF and PDF of $X_{(i)}$ are given by




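$$f_{X_{(i)}}(x) = \frac{n!}{(i-1)!\,(n-i)!} f_X(x) \big[F_X(x)\big]^{i-1} \big[1 - F_X(x)\big]^{n-i},$$

$$F_{X_{(i)}}(x) = \sum_{k=i}^{n} \binom{n}{k} \big[F_X(x)\big]^{k} \big[1 - F_X(x)\big]^{n-k}.$$

Also, the joint PDF of $X_{(1)}, X_{(2)}, \cdots, X_{(n)}$ is given by

$$f_{X_{(1)}, \cdots, X_{(n)}}(x_1, x_2, \cdots, x_n) =
\begin{cases}
n!\, f_X(x_1) f_X(x_2) \cdots f_X(x_n) & \text{for } x_1 \leq x_2 \leq \cdots \leq x_n \\
0 & \text{otherwise}
\end{cases}$$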
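
The two marginal formulas in Theorem 8.1 translate directly into code. The sketch below is one
possible transcription; the function names `order_stat_pdf` and `order_stat_cdf` are
hypothetical, and the Exponential(1) distribution is used only as an illustrative input, chosen
because the minimum of $n$ independent Exponential(1) variables is known to be Exponential($n$).

```python
import math

def order_stat_pdf(i, n, x, pdf, cdf):
    """PDF of X_(i) at x, given the common PDF and CDF of the sample."""
    coeff = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    F = cdf(x)
    return coeff * pdf(x) * F ** (i - 1) * (1.0 - F) ** (n - i)

def order_stat_cdf(i, n, x, cdf):
    """CDF of X_(i) at x: probability that at least i of the n values are <= x."""
    F = cdf(x)
    return sum(math.comb(n, k) * F ** k * (1.0 - F) ** (n - k)
               for k in range(i, n + 1))

# Illustrative input distribution (an assumption, not from the text): Exponential(1).
def exp_pdf(x):
    return math.exp(-x) if x >= 0 else 0.0

def exp_cdf(x):
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

n, x = 3, 0.5
# Minimum of 3 Exponential(1) variables: compare with the Exponential(3) formulas.
print(order_stat_pdf(1, n, x, exp_pdf, exp_cdf), n * math.exp(-n * x))
print(order_stat_cdf(1, n, x, exp_cdf), 1.0 - math.exp(-n * x))
# Maximum: its CDF should equal F_X(x) ** n.
print(order_stat_cdf(n, n, x, exp_cdf), exp_cdf(x) ** n)
```

Each printed pair should agree up to floating-point rounding, which is a quick sanity check on
the transcription rather than a verification of the theorem itself.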


A method to prove the above theorem is outlined in the End of Chapter Problems section. Let’s 
look at an example. 


Example 8.1 

Let $X_1, X_2, X_3, X_4$ be a random sample from the Uniform$(0,1)$ distribution, and let
$X_{(1)}, X_{(2)}, X_{(3)}, X_{(4)}$ be the corresponding order statistics. Find the PDFs of
$X_{(1)}$, $X_{(2)}$, and $X_{(4)}$.

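Solution:

Here, the ranges of the random variables are $[0,1]$, so the PDFs are zero outside of $[0,1]$.
We have

$$f_X(x) = 1, \quad \text{for } x \in [0,1],$$

and

$$F_X(x) = x, \quad \text{for } x \in [0,1].$$

By Theorem 8.1, we obtain

$$\begin{aligned}
f_{X_{(1)}}(x) &= \frac{4!}{(1-1)!\,(4-1)!} f_X(x) \big[F_X(x)\big]^{1-1} \big[1 - F_X(x)\big]^{4-1} \\
&= 4 f_X(x) \big[1 - F_X(x)\big]^{3} \\
&= 4(1-x)^3, \quad \text{for } x \in [0,1].
\end{aligned}$$

$$\begin{aligned}
f_{X_{(2)}}(x) &= \frac{4!}{(2-1)!\,(4-2)!} f_X(x) \big[F_X(x)\big]^{2-1} \big[1 - F_X(x)\big]^{4-2} \\
&= 12 f_X(x) F_X(x) \big[1 - F_X(x)\big]^{2} \\
&= 12x(1-x)^2, \quad \text{for } x \in [0,1].
\end{aligned}$$

$$\begin{aligned}
f_{X_{(4)}}(x) &= \frac{4!}{(4-1)!\,(4-4)!} f_X(x) \big[F_X(x)\big]^{4-1} \big[1 - F_X(x)\big]^{4-4} \\
&= 4 f_X(x) \big[F_X(x)\big]^{3} \\
&= 4x^3, \quad \text{for } x \in [0,1].
\end{aligned}$$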
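
As a sanity check on Example 8.1, one can compare the mean implied by each derived PDF with the
average of simulated order statistics. This is only a Monte Carlo sketch: the number of
repetitions `reps`, the midpoint-rule integration in `mean_from_pdf`, and the choice to compare
means (rather than full densities) are assumptions made to keep the example short.

```python
import random

# PDFs derived in Example 8.1, valid for x in [0, 1].
pdfs = {
    1: lambda x: 4 * (1 - x) ** 3,
    2: lambda x: 12 * x * (1 - x) ** 2,
    4: lambda x: 4 * x ** 3,
}

def mean_from_pdf(pdf, steps=10_000):
    """Midpoint-rule approximation of the integral of x * pdf(x) over [0, 1]."""
    h = 1.0 / steps
    return sum((k + 0.5) * h * pdf((k + 0.5) * h) for k in range(steps)) * h

rng = random.Random(4)            # fixed seed so the run is repeatable
reps = 20_000
totals = {1: 0.0, 2: 0.0, 4: 0.0}
for _ in range(reps):
    ordered = sorted(rng.random() for _ in range(4))   # X_(1), ..., X_(4)
    for i in totals:
        totals[i] += ordered[i - 1]                    # ordered[i - 1] is X_(i)

simulated = {i: totals[i] / reps for i in totals}
theoretical = {i: mean_from_pdf(pdfs[i]) for i in pdfs}

for i in (1, 2, 4):
    print(i, simulated[i], theoretical[i])
```

The simulated and theoretical means should agree to about two decimal places; a large gap would
point to an algebra slip in the derived PDF.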

